FAST-LIO pcap record/replay + Virtual Livox#2498
Conversation
…lio_native flake/cmake consuming dimos-module-fastlio2 pointlio branch + Estimator/parameters sources)
…y path) The fast-lio input was pinned to file:///Users/jeffhykin/... which only exists on the Mac. Repoint to the dimensionalOS/dimos-module-fastlio2 pointlio branch on github so the flake builds on Linux. Same locked rev.
Rename the mirrored fastlio_blueprints.py to pointlio_blueprints.py and wire it to PointLio (was incorrectly using FastLio2). Adds mid360-pointlio and mid360-pointlio-voxels to the blueprint registry.
…race-fix flake relock Adds the offline replay/inspection tooling for the pointlio_native module: pcap_replay.hpp (streams a Livox pcap into the SDK callbacks), deterministic_clock + dual-thread replay options, the ruwik2_pt3 replay harness, the pcap_to_db tool (append pointlio_odometry into an existing memory db at ~30Hz, streaming), validated Point-LIO mid360.yaml, and the live smoke-test helper. flake.lock relocks onto the fastlio2 pointlio rev carrying the mtx_buffer race fix; main.cpp comment corrected to attribute the divergence fix to that lock (not the publish-rate gating).
Switches the flake's fast-lio input from path:/home/dimos/repos/dimos-module-fastlio2 to github:dimensionalOS/dimos-module-fastlio2/pointlio so the module builds on any machine without a local fastlio2 clone. Relocked onto rev 7e5d88f (the mtx_buffer race fix); verified pointlio_native builds from the github source.
…hful config OUTPUT-model replay of the Jun-7 hand-shake bag diverged to ~25km (vs hku-mars master's bounded ~5m). Two issues, both fixed here (paired with the dimos-module-fastlio2 pointlio-branch curvature/deque fixes): - main.cpp: heap-use-after-free thread race in bag-frame replay. The feeder thread drives run_main_iter via step() after each feed, but the main thread also drove run_main_iter, so both raced on the shared PCL measurement cloud (ASan: free in sync_packages, read-after-free in ImuProcess::Process). Force serial_replay in bag-frame mode so only the feeder thread touches the EKF. - CMakeLists.txt + flake.nix: the iVox map backend needs glog; add the find_package/link and the nix buildInput. Drop ikd_Tree.cpp from sources (iVox replaces ikd-Tree). - config/mid360.yaml (+ cpp/config duplicate): set satu_acc 5.5->3.0 (master value — accel >=3g saturated, residual zeroed, keeps velocity bounded) and add ivox_grid_resolution=2.0 / ivox_nearby_type=6 (NEARBY6) matching the master config that produced the 5.028m baseline. - replay_ruwik2_pt3.py: make MAX_WALL_SEC env-overridable. After the fix the OUTPUT model stays bounded (peak |pos| 3.898m) on the full bag; ASan reports zero memory errors.
…onsumes the iVox fix The iVox-port + divergence fixes live in dimos-module-fastlio2 @ pointlio commit 02d5066, which is committed locally but NOT pushed (per the task's no-push constraint). The flake's fast-lio input was pointing at github:dimensionalOS/dimos-module-fastlio2/pointlio, whose lock pinned the pre-fix rev 7e5d88f. Since CMakeLists.txt already drops ikd_Tree.cpp (iVox replaces ikd-Tree), building against that stale github source fails with KD_TREE undefined-reference linker errors. Repoint fast-lio at path:/home/dimos/repos/dimos-module-fastlio2 and bump the lock so `nix build .#pointlio_native` consumes the local 02d5066 source. Verified: build exits 0, no KD_TREE errors, 661176-byte binary. NOTE: this bakes a machine-specific absolute path into the lock. Once Repo A's pointlio branch is pushed, switch this input back to github:dimensionalOS/dimos-module-fastlio2/pointlio and bump the lock to 02d5066 for portability.
Now that dimos-module-fastlio2 pointlio (02d5066, the iVox port + divergence fix) is on origin, switch the fast-lio flake input from the local path: pin back to github:dimensionalOS/dimos-module-fastlio2/pointlio and bump the lock to rev 02d5066. This makes jeff/feat/pointlio_native self-contained and portable (no machine-specific path in the lock). Verified: nix build .#pointlio_native exits 0, no KD_TREE errors.
… add --force Extends pcap_to_db.py to populate both pointlio_odometry and pointlio_lidar streams (start-aligned onto the db's earliest ts) so pointlio can be compared against fastlio in one recording. --force overwrites existing pointlio streams. Untracks the machine-specific fastlio_test/setup_network symlinks (enhance overlay) and drops the empty tools/__init__.py.
…o tools - Consolidate config/ to one default.yaml (the tuned Mid-360/iVox config); remove the unused upstream sensor presets (avia, horizon, marsim, ouster64, velodyne, mid360). Module now defaults to default.yaml. - Remove tools/replay_ruwik2_pt3.py (pcap_to_db.py covers offline replay) and tools/demo_live_test.py.
…egistry - --db is now optional: with no existing db, build one from scratch (defaults to <pcap>.db next to the pcap); with an existing db, append + time-align as before. - Rename the internal recorder Rec/RecConfig -> _Rec/_RecConfig so the blueprint-registry generator skips it (a tool helper isn't a public module); fixes test_all_blueprints_is_current. - Docstring: document from-scratch generation using the ruwik2_part3 LFS sample.
… rm cpp/README + config copies; concise comments
- config: consolidate to a single config/default.yaml; drop the ROS-only
lid_topic/imu_topic + odom frame-id/publish/pcd_save keys (parsed-but-unused
by the native binary), and note which params are Mid-360-specific.
- delete cpp/config/{mid360.json,mid360.yaml} (duplicate) and cpp/README.md.
- tighten verbose comments in main.cpp, pcap_replay.hpp, timing.hpp
(comments only; binary rebuilds clean).
… db name in docstring
120s clip (elapsed 55-175s) of the ruwik velocity-spike recording — the data
that diverges through FAST-LIO at the 0.5m pre-KF voxel and is bounded under
Point-LIO. Use via get_data('ruwik2_part3'); pcap_to_db --pcap <it> builds the
db from scratch.
….0 clock bug - module.py docstring: Point-LIO (not FAST-LIO2) + correct import path. - pcap_to_db.py: use 'is not None' (not 'or') for the ts fallback so a real sensor ts of 0.0 isn't replaced by wall time (which would misclassify the stream clock in _resolve_offset).
… NIC Reads a Livox Mid-360 pcap into RAM, rewrites packet timestamps to current-time, and replays point/IMU/status onto a virtual network interface at a configurable rate + delay. Synthesizes the Livox SDK2 control protocol (discovery + GetInternalInfo/FwType ACKs, CRC16/CRC32) so an unmodified consumer (pointlio) handshakes with it as a real sensor. Builds via nix (rustPlatform.buildRustPackage, cargoLock from Cargo.lock).
…bal_map - Cloud now published in the sensor frame (mid360_link): use fastlio2 get_body_cloud() (undistorted scan, no world registration) instead of inverse-transforming the world cloud. No transform in publish_lidar. - Split frames: frame_id (mid360_link) on both the cloud + odometry headers; odom_parent_frame_id (odom) -> odom_frame_id (base_link) for the TF publish. - Remove global_map / voxel_map.hpp entirely (deleted file, config, blueprint + pcap_to_db references). - Bump fast-lio pin to fcbd1c2 (adds get_body_cloud).
get_data() at module level triggered a git-LFS download during blueprint validation (test_blueprint_is_valid), which CI blocks via git-lfs-guard — failing the whole test matrix. Default pcap to empty; resolve the capture path at run time instead.
Port the minimal pcap-replay subsystem from jeff/feat/go2_record into the clean branch so FAST-LIO can run offline from a Mid-360 pcap, matching the Point-LIO pcap_to_db workflow: - cpp: pcap_replay.hpp + timing.hpp (header-only), main.cpp refactored so the main loop runs from either the live SDK or a pcap feeder thread, with an optional deterministic sensor-clock mode. Keeps the clean branch's velocity-cap (guarded set_max_velocity_norm_ms) and flake (velocity-cap fast-lio); does not pull go2_record's tcpdump record path. - module.py: replay_pcap / replay_skip_until_ns / first_packet_marker / deterministic_clock config fields; skip network validation in replay mode. - tools/pcap_to_db.py: replay a pcap through FastLio2 (real-time, non- deterministic) and append fastlio_odometry + fastlio_lidar into an existing memory2 db, time-aligned onto its clock. --force overwrites.
…FastLio2Recorder - livox/pcap_recorder.py: standalone tcpdump pcap capture (LivoxPcapRecorder), decoupled from FAST-LIO. The lidar/SLAM module no longer owns packet capture. - fastlio2/recorder.py: FastLio2Recorder records fastlio_odometry + fastlio_lidar and rewrites ONLY those streams' timestamps onto the db clock (promoted from pcap_to_db's inline recorder; fixes the ts==0.0 falsy-fallback bug). - pcap_to_db.py now imports FastLio2Recorder instead of an inline copy.
FastLio2 no longer produces a global voxel map — odometry + registered lidar only. Removed global_map Out, map config, and the mapping.GlobalPointcloud spec. Updated all consumers (fastlio_blueprints, alfred_nav, g1_onboard, g1_nav_onboard, mobile, pcap_to_db) to drop map args + the global_map remap. Nav blueprints lose their fastlio map; full nav wants a separate mapper wired in (follow-up).
Single space around assignment (was column-aligned, lisp-style) and collapse over-aligned inline comments.
It was a [&]-capturing lambda called from exactly one place (a leftover of the old replay/virtual-clock design). Inline the body into the while loop and drop the now-redundant check_now copy.
… a lambda) Extract the per-iteration body into a plain static function that takes the time point + the rate-limit bookmarks/intervals/flags as explicit args, instead of a [&]-capturing lambda.
…inery Remove _existing_min_ts / _resolve_offset / _resolve_ts / time_offset and the EPS tie-breaker. Records use the base Recorder's ts (msg.ts) directly — the native modules always stamp a real epoch ts, so no re-basing is needed. Drop the now-unused --time-offset from both pcap_to_db tools.
The README was redundant; the tuning comment referenced the removed YAML/_YAML_LAYOUT (it's plain CLI args now).
Drop the jhist pointer and Go2-specific wording; describe acc_cov generically (high vs low IMU-accel trust).
…streams _prepare_streams is now 3 lines: delete the recorder's own two streams if present (keeping any other streams), then record. Removes the force config + the refuse- to-overwrite raise, and --force from both pcap_to_db tools.
…ream_name Name the In ports as the db tables (fastlio_/pointlio_odometry/lidar) and wire them to the module's odometry/lidar outputs with .remappings() in pcap_to_db — matching the base Recorder convention (stream = port name). Removes _stream_name and the odom/lidar_stream_name config; _prepare_streams just replaces our own ports' streams. pcap_to_db drops the now-fixed --odom/lidar-stream-name args.
Move record_tf + @pose_setter_for into Recorder/RecorderConfig and delete tf_recorder.py. Every Recorder now records the live tf stream by default and supports per-stream pose setters; fastlio/pointlio recorders subclass Recorder directly. Drop the now-redundant tf-recorder blueprint entry.
Mirror the fastlio no-yaml change: PointLioConfig tuning fields are passed to the binary as CLI args (read into a PointLioParams struct in main.cpp) instead of being rendered to a throwaway YAML read via --config_path. Removes the yaml-cpp dependency from the flake + CMake and the config-file plumbing from the module. Requires the matching dimos-module-fastlio2 pointlio-branch change (flake.lock bumped to 288e357).
{port_name: db_stream_name} to control the recorded stream/table name without
subclassing — conceptually what .remappings() expresses, but the active
remappings aren't readily accessible from inside the module.
… frame Both modules now expose the same three frames: frame_id (fixed odom, the odometry header + TF parent), child_frame_id (moving body, the odometry child + TF child), and sensor_frame_id (the lidar's own frame, stamped on the published point cloud). get_body_cloud is the undistorted scan in the sensor frame, so the cloud was previously mislabeled — fastlio stamped it with the body frame and pointlio reused frame_id for both the cloud and the odometry header. pointlio's body_start_frame_id/body_frame_id are folded into this scheme. Drops a stale PGO comment.
Use the lidar's concrete frame name rather than the generic FRAME_SENSOR.
…_remapping Move the replace-on-append logic into the base Recorder so it drops exactly the streams it is about to rewrite: the remapped In-port streams (via _stream_name) AND the tf stream. Previously each recorder's _prepare_streams deleted only the raw In-port names, so re-running into an existing db appended a second full copy of the tf tree and would orphan remapped streams. Also guard _record_tf for non-pubsub tf backends, and drop the now-dead world/global_map_fastlio rerun overrides (FastLio2's global_map port was removed in this PR).
Greptile SummaryThis PR adds pcap record/replay capability to FAST-LIO2 via a new
Confidence Score: 5/5Safe to merge — the core data recording and FAST-LIO integration are correct; the only findings are a visualization misalignment in the non-fatal .rrd quick-look and a Nix branch reference that could drift on flake update. The substantial C++ refactor (dropping world-frame cloud, yaml-cpp, voxel map) and the new Python recorder/tool are logically consistent. The shutdown race fix is a genuine correctness improvement. No data integrity issues were found in the recorded DB path. dimos/hardware/sensors/lidar/fastlio2/cpp/flake.nix (upstream branch reference) and dimos/hardware/sensors/lidar/fastlio2/tools/pcap_to_db.py (_write_rrd frame handling). Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant P as pcap file
participant VM as VirtualMid360
participant SDK as Livox SDK (UDP)
participant FL as FastLio2 (C++ binary)
participant REC as FastLio2Recorder
participant DB as SQLite DB (.db)
participant RRD as .rrd (quick-look)
P->>VM: replay pcap packets
VM->>SDK: emit UDP packets on alias NIC
SDK->>FL: on_point_cloud / on_imu_data callbacks
FL->>FL: FAST-LIO IESKF step (get_body_cloud)
FL-->>REC: lidar (sensor-frame PointCloud2) + odometry (body-frame)
REC->>REC: _odom_pose attaches body pose to lidar frame
REC->>DB: append fastlio_odometry + fastlio_lidar
Note over DB,RRD: after pcap drains
DB->>RRD: _write_rrd aggregates cloud + path
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant P as pcap file
participant VM as VirtualMid360
participant SDK as Livox SDK (UDP)
participant FL as FastLio2 (C++ binary)
participant REC as FastLio2Recorder
participant DB as SQLite DB (.db)
participant RRD as .rrd (quick-look)
P->>VM: replay pcap packets
VM->>SDK: emit UDP packets on alias NIC
SDK->>FL: on_point_cloud / on_imu_data callbacks
FL->>FL: FAST-LIO IESKF step (get_body_cloud)
FL-->>REC: lidar (sensor-frame PointCloud2) + odometry (body-frame)
REC->>REC: _odom_pose attaches body pose to lidar frame
REC->>DB: append fastlio_odometry + fastlio_lidar
Note over DB,RRD: after pcap drains
DB->>RRD: _write_rrd aggregates cloud + path
Reviews (5): Last reviewed commit: "Merge branch 'main' into jeff/feat/fastl..." | Re-trigger Greptile |
| last_lidar_max = 0.0 | ||
| first_max: float | None = None | ||
| stagnant_since: float | None = None | ||
| start_time = time.time() | ||
| while True: | ||
| time.sleep(_POLL_SEC) | ||
| odom_cnt, odom_min, odom_max = _odom_stats(db_path, odom_stream) | ||
| if odom_cnt == 0: | ||
| # Stagnation timeout only arms once odometry exists, so bound the | ||
| # no-output wait separately or a dead binary would hang forever. | ||
| if time.time() - start_time > _STARTUP_TIMEOUT_SEC: | ||
| print( | ||
| f"[pcap_to_db] no odometry after {_STARTUP_TIMEOUT_SEC:.0f}s — FAST-LIO " | ||
| "failed to start (check the binary, pcap path, and interface setup).", | ||
| file=sys.stderr, | ||
| flush=True, | ||
| ) | ||
| return False | ||
| continue | ||
| if first_max is None: | ||
| first_max = odom_min | ||
| if max_sensor_sec > 0 and (odom_max - first_max) >= max_sensor_sec: | ||
| print(f"[pcap_to_db] reached --max-sensor-sec={max_sensor_sec:.1f}s", flush=True) | ||
| return True | ||
| _, _, lidar_max = _odom_stats(db_path, lidar_stream) | ||
| if lidar_max == last_lidar_max: | ||
| if stagnant_since is None: | ||
| stagnant_since = time.time() | ||
| elif time.time() - stagnant_since > _STAGNANT_SEC: | ||
| return True | ||
| else: | ||
| last_lidar_max = lidar_max | ||
| stagnant_since = None |
There was a problem hiding this comment.
Stagnation timer arms prematurely on first poll
last_lidar_max is initialised to 0.0, and _odom_stats returns 0.0 for MAX(ts) when the table is empty (via row[2] or 0.0). The very first poll iteration after odometry appears will therefore evaluate lidar_max == last_lidar_max as True, immediately arming stagnant_since. If the recorder's lidar table hasn't been written yet at that moment — which can happen because FAST-LIO rate-limits lidar publishes to pointcloud_freq (default 100 ms) — the 5-second stagnation clock is already running. In the worst case this causes an early exit before a single lidar frame is recorded.
The same pattern exists in dimos/hardware/sensors/lidar/pointlio/scripts/pcap_to_db.py (line 263). Initialising last_lidar_max = None and skipping the stagnation check until at least one non-zero lidar frame has been observed would avoid the false positive.
…_stream_name Braces on every if/for/while (inline single-statement bodies), collapse the awkwardly-wrapped calls/signatures to one line, and break the long run_main_iter def/call with the closing paren on its own line. Inline get_publish_ts (and the Recorder._stream_name helper, now config.stream_remapping.get at the use sites); add the missing braces to pointlio's parse_doubles too.
The pose-setter dict is typed dict[str, Any] (to avoid evaluating the Pose forward-ref at class-definition time), so cast its result back to Pose | None.
The pointlio improvements (no-yaml config, Recorder-subclass rework, frame_id scheme) move to jeff/feat/pointlio_native via merge; this PR carries only the fastlio2 work + shared memory2 Recorder base.
Adds virtual mid360 pcap record/replay to FAST-LIO; rips out the in-process pcap reader and the output-cloud downsampling/denoising.
Usage / Testing
Pcap to DB (also generates .rrd)
# gen <pcap>.db from the pcap (a <db>.rrd quick-look is written next to it) python -m dimos.hardware.sensors.lidar.fastlio2.tools.pcap_to_db \ --db output_mem2.db \ --config overrides.yaml \ --pcap mid360_shake_stairs/mid360_shake_stairs.pcapRecord a Mid-360 pcap
Replay a pcap to a module